AITopics | final output

A Proof and Derivations

Neural Information Processing Systems | Feb-11-2026, 10:15:09 GMT

However, the underlying clean model doesn't always exist for imperfect model ... Theorem A.1 (Necessary and sufficient conditions for the existence of the underlying clean model.) ... This theorem is a straightforward corollary of Bochner's I. (26) ... We can also expand the Hessian of the log q ... We can then prove Theorem 2.3. ... All the experiments conducted in this paper are run on a single NVIDIA GTX 3090.

artificial intelligence, derivation, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication

Qian Yu, Mohammad Maddah-Ali, Salman Avestimehr

Neural Information Processing Systems | Nov-21-2025, 13:31:10 GMT

For example, replicating the straggling task on another available node is a common approach to deal with stragglers (e.g., [

artificial intelligence, machine learning, polynomial code, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Counterfactual-based Agent Influence Ranker for Agentic AI Workflows

Giloni, Amit, Picardi, Chiara, Betser, Roy, Bose, Shamik, Sabapathy, Aishvariya Priya Rathina, Vainshtein, Roman

arXiv.org Artificial Intelligence | Oct-30-2025

An Agentic AI Workflow (AAW), also known as an LLM-based multi-agent system, is an autonomous system that assembles several LLM-based agents to work collaboratively towards a shared goal. The high autonomy, widespread adoption, and growing interest in such AAWs highlight the need for a deeper understanding of their operations, from both quality and security aspects. To date, no methods exist to assess the influence of each agent on the AAW's final output. Adopting techniques from related fields is not feasible, since existing methods perform only static structural analysis, which is unsuitable for inference-time execution. We present Counterfactual-based Agent Influence Ranker (CAIR), the first method for assessing the influence level of each agent on the AAW's output and determining which agents are the most influential. By performing counterfactual analysis, CAIR provides a task-agnostic analysis that can be used both offline and at inference time. We evaluate CAIR using a dataset of AAWs that we created, containing 30 different use cases with 230 different functionalities. Our evaluation showed that CAIR produces consistent rankings, outperforms baseline methods, and can easily enhance the effectiveness and relevancy of downstream tasks.

agent, artificial intelligence, query, (16 more...)

arXiv.org Artificial Intelligence

2510.25612

Genre: Research Report (1.00)

Industry:

Health & Medicine > Consumer Health (0.93)
Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

A Pragmatic Way to Measure Chain-of-Thought Monitorability

Emmons, Scott, Zimmermann, Roland S., Elson, David K., Shah, Rohin

arXiv.org Artificial Intelligence | Oct-29-2025

While Chain-of-Thought (CoT) monitoring offers a unique opportunity for AI safety, this opportunity could be lost through shifts in training practices or model architecture. To help preserve monitorability, we propose a pragmatic way to measure two components of it: legibility (whether the reasoning can be followed by a human) and coverage (whether the CoT contains all the reasoning needed for a human to also produce the final output). We implement these metrics with an autorater prompt that enables any capable LLM to compute the legibility and coverage of existing CoTs. After sanity-checking our prompted autorater with synthetic CoT degradations, we apply it to several frontier models on challenging benchmarks, finding that they exhibit high monitorability. We present these metrics, including our complete autorater prompt, as a tool for developers to track how design decisions impact monitorability. While the exact prompt we share is still a preliminary version under ongoing development, we are sharing it now in the hope that others in the community will find it useful. Our method helps measure the default monitorability of CoT; it should be seen as a complement to, not a replacement for, the adversarial stress-testing needed to test robustness against deliberately evasive models.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.23966

Genre: Research Report (0.85)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.90)

Code Execution as Grounded Supervision for LLM Reasoning

Jung, Dongwon, Zhou, Wenxuan, Chen, Muhao

arXiv.org Artificial Intelligence | Oct-21-2025

Training large language models (LLMs) with chain-of-thought (CoT) supervision has proven effective for enhancing their reasoning abilities. However, obtaining reliable and accurate reasoning supervision remains a significant challenge. We propose a scalable method for generating a high-quality CoT supervision dataset by leveraging the determinism of program execution. Unlike existing reasoning dataset generation methods that rely on costly human annotations or error-prone LLM-generated CoT, our approach extracts verifiable, step-by-step reasoning traces from code execution and transforms them into natural-language CoT reasoning. Experiments on reasoning benchmarks across various domains show that our method effectively equips LLMs with transferable reasoning abilities across diverse tasks. Furthermore, the ablation studies validate that our method produces highly accurate reasoning data and reduces overall token length during inference by reducing meaningless repetition and overthinking.
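
As a rough illustration of what "extracting step-by-step traces from code execution" could look like, the sketch below records the local variables of a small function line by line using Python's standard `sys.settrace` hook, then renders each step as a plain sentence. This is not the authors' pipeline, which the abstract does not describe in detail; the helper names `trace_execution`, `render_steps`, and `total_of_squares` are hypothetical and chosen only for this example.

```python
import sys


def trace_execution(func, *args):
    """Run func(*args) and record a snapshot of its locals at each executed line.

    Hypothetical helper for illustration; not taken from the paper.
    """
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only follow the frame that belongs to the function under study.
        if frame.f_code is not code:
            return None
        if event == "line":
            # A "line" event fires just before the line runs, so the snapshot
            # shows the state *before* that line executes.
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))
        elif event == "return":
            steps.append(("return", arg))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Always restore whatever tracer was installed before.
        sys.settrace(previous)
    return result, steps


def render_steps(steps):
    """Turn recorded steps into simple natural-language sentences."""
    lines = []
    for where, payload in steps:
        if where == "return":
            lines.append(f"The function returns {payload!r}.")
        else:
            state = ", ".join(f"{k} = {v!r}" for k, v in payload.items())
            lines.append(f"Before line {where}: {state or 'no variables yet'}.")
    return lines


def total_of_squares(n):
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total


result, steps = trace_execution(total_of_squares, 3)
for sentence in render_steps(steps):
    print(sentence)
```

Because the trace comes from actually running the program, every recorded value is correct by construction, which is the property the abstract relies on when it contrasts this kind of supervision with LLM-generated reasoning.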

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2506.10343

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

4a4a3c197deac042461c677219efd36c-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 15:16:15 GMT

arg min kl, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.70)

D-REX: A Benchmark for Detecting Deceptive Reasoning in Large Language Models

Krishna, Satyapriya, Zou, Andy, Gupta, Rahul, Jones, Eliot Krzysztof, Winter, Nick, Hendrycks, Dan, Kolter, J. Zico, Fredrikson, Matt, Matsoukas, Spyros

arXiv.org Artificial Intelligence | Sep-23-2025

The safety and alignment of Large Language Models (LLMs) are critical for their responsible deployment. Current evaluation methods predominantly focus on identifying and preventing overtly harmful outputs. However, they often fail to address a more insidious failure mode: models that produce benign-appearing outputs while operating on malicious or deceptive internal reasoning. This vulnerability, often triggered by sophisticated system prompt injections, allows models to bypass conventional safety filters, posing a significant, underexplored risk. To address this gap, we introduce the Deceptive Reasoning Exposure Suite (D-REX), a novel dataset designed to evaluate the discrepancy between a model's internal reasoning process and its final output. D-REX was constructed through a competitive red-teaming exercise where participants crafted adversarial system prompts to induce such deceptive behaviors. Each sample in D-REX contains the adversarial system prompt, an end-user's test query, the model's seemingly innocuous response, and, crucially, the model's internal chain-of-thought, which reveals the underlying malicious intent. Our benchmark facilitates a new, essential evaluation task: the detection of deceptive alignment. We demonstrate that D-REX presents a significant challenge for existing models and safety mechanisms, highlighting the urgent need for new techniques that scrutinize the internal processes of LLMs, not just their final outputs.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.17938

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.93)
Government > Military (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Models and Language Model Prompting for the Multidimensional Evaluation of Open-Ended Conversations

Elizabeth, Michelle, Kasicka, Alicja, Krawczyk, Natalia, Ochs, Magalie, Lecorvé, Gwénolé, Gromada, Justyna, Rojas-Barahona, Lina M.

arXiv.org Artificial Intelligence | Sep-3-2025

The growing number of generative AI-based dialogue systems has made their evaluation a crucial challenge. This paper presents our contribution to this important problem through the Dialogue System Technology Challenge (DSTC-12, Track 1), where we developed models to predict dialogue-level, dimension-specific scores. Given the constraint of using relatively small models (i.e., fewer than 13 billion parameters), our work follows two main strategies: employing Language Models (LMs) as evaluators through prompting, and training encoder-based classification and regression models. Our results show that while LM prompting achieves only modest correlations with human judgments, it still ranks second on the test set, outperformed only by the baseline. The regression and classification models, with significantly fewer parameters, demonstrate high correlation for some dimensions on the validation set. Although their performance decreases on the test set, the test set contains annotations whose score ranges for some dimensions differ significantly from those in the train and validation sets.

dimension, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2509.00841

Country:

Europe (0.68)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Documenting Deployment with Fabric: A Repository of Real-World AI Governance

Jorgensen, Mackenzie, Brogle, Kendall, Collins, Katherine M., Ibrahim, Lujain, Shah, Arina, Ivanovic, Petra, Broestl, Noah, Piles, Gabriel, Dongha, Paul, Abdulhussein, Hatim, Weller, Adrian, Powers, Jillian, Bhatt, Umang

arXiv.org Artificial Intelligence | Sep-1-2025

Artificial intelligence (AI) is increasingly integrated into society, from financial services and traffic management to creative writing. Academic literature on the deployment of AI has mostly focused on the risks and harms that result from the use of AI. We introduce Fabric, a publicly available repository of deployed AI use cases to outline their governance mechanisms. Through semi-structured interviews with practitioners, we collect an initial set of 20 AI use cases. In addition, we co-design diagrams of the AI workflow with the practitioners. We discuss the oversight mechanisms and guardrails used in practice to safeguard AI use. The Fabric repository includes visual diagrams of AI use cases and descriptions of the deployed systems. Using the repository, we surface gaps in governance and find common patterns in human oversight of deployed AI systems. We intend for Fabric to serve as an extendable, evolving tool for researchers to study the effectiveness of AI governance.

ai system, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.14119

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre:

Questionnaire & Opinion Survey (0.88)
Research Report (0.82)
Workflow (0.69)
Personal > Interview (0.66)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Adversarial Manipulation of Reasoning Models using Internal Representations

Yamaguchi, Kureha, Etheridge, Benjamin, Arditi, Andy

arXiv.org Artificial Intelligence | Aug-29-2025

Reasoning models generate chain-of-thought (CoT) tokens before their final output, but how this affects their vulnerability to jailbreak attacks remains unclear. While traditional language models make refusal decisions at the prompt-response boundary, we find evidence that DeepSeek-R1-Distill-Llama-8B makes these decisions within its CoT generation. We identify a linear direction in activation space during CoT token generation that predicts whether the model will refuse or comply, which we term the "caution" direction because it corresponds to cautious reasoning patterns in the generated text. Ablating this direction from model activations increases harmful compliance, effectively jailbreaking the model. We additionally show that intervening only on CoT token activations suffices to control final outputs, and that incorporating this direction into prompt-based attacks improves success rates. Our findings suggest that the chain-of-thought itself is a promising new target for adversarial manipulation in reasoning models. Code available at https://github.com/ky295/reasoning-manipulation.
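
For readers unfamiliar with the idea of ablating a linear direction from activations, one common formulation is orthogonal projection: subtract from each activation vector its component along the direction. The abstract does not say which formulation the authors use, so the sketch below is an assumption, written in plain Python on toy vectors; the function name `ablate_direction` and the example numbers are hypothetical.

```python
import math


def ablate_direction(activation, direction):
    """Remove the component of `activation` that lies along `direction`.

    Computes x - (x . u) u, where u is `direction` scaled to unit length.
    Illustrative only; assumes orthogonal projection is the ablation used.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of the activation onto the unit direction.
    coeff = sum(a * u for a, u in zip(activation, unit))
    return [a - coeff * u for a, u in zip(activation, unit)]


# Toy 2-D example: the direction is the second axis, so ablation zeroes
# that coordinate and leaves the first one untouched.
activation = [3.0, 4.0]
direction = [0.0, 2.0]
ablated = ablate_direction(activation, direction)
print(ablated)  # [3.0, 0.0]
```

After ablation the vector has no component along the chosen direction, so any downstream computation that reads that direction linearly sees zero, which is the intuition behind using it as an intervention.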

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.03167

Genre: Research Report > New Finding (0.86)

Industry:

Law (1.00)
Information Technology > Security & Privacy (0.94)
Health & Medicine > Therapeutic Area > Immunology (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
